DAGGER: Using Instance Selection to Combine Multiple Models Learned from Disjoint Subsets
Abstract
We introduce a novel instance selection method for combining multiple learned models. The technique yields a single comprehensible model, in contrast with current methods, which typically combine models by voting. The core of the technique, the DAGGER (Disjoint Aggregation using Example Reduction) algorithm, selects examples that provide evidence for each decision region within each local model. A single model is then learned from the union of these selected examples. We describe experiments on models learned from disjoint training sets that show:
• DAGGER performs as well as weighted voting on this task;
• DAGGER extracts examples that are more informative than examples selected at random.
The experiments were conducted on models learned from disjoint subsets generated with a uniform random distribution. DAGGER, however, is designed for naturally distributed tasks, where the distribution is non-random. We discuss how one view of the experimental results suggests that DAGGER should work well on this type of problem.
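The pipeline described in the abstract (learn a local model per disjoint subset, select examples that support each decision region, then learn one model from the union) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a toy setting that the abstract does not specify: one-dimensional, noise-free data, a decision stump as the local learner, and a hypothetical selection rule that keeps the smallest and largest correctly classified example in each region. The names `learn_stump`, `select_examples`, and `combine_by_selection` are invented for illustration.

```python
import random

def learn_stump(examples):
    """Fit a 1-D decision stump: predict 1 when x > threshold, else 0."""
    xs = sorted(x for x, _ in examples)
    # Candidate thresholds: below all points, between neighbours, above all points.
    candidates = [xs[0] - 1.0]
    candidates += [(a + b) / 2 for a, b in zip(xs, xs[1:])]
    candidates.append(xs[-1] + 1.0)

    def errors(t):
        return sum(int(x > t) != y for x, y in examples)

    return min(candidates, key=errors)

def predict(threshold, x):
    return int(x > threshold)

def select_examples(threshold, examples):
    """Assumed selection rule (not taken from the paper): for each of the
    stump's two decision regions, keep the extreme examples whose label
    agrees with the region's prediction."""
    selected = []
    for region_label in (0, 1):
        support = sorted(
            (x, y) for x, y in examples
            if predict(threshold, x) == region_label and y == region_label
        )
        if support:
            # A set avoids a duplicate when the region holds a single example.
            selected.extend(sorted({support[0], support[-1]}))
    return selected

def combine_by_selection(subsets):
    """Learn one local stump per disjoint subset, pool the selected
    examples, and learn a single stump from the pooled examples."""
    union = []
    for subset in subsets:
        local_model = learn_stump(subset)
        union.extend(select_examples(local_model, subset))
    return learn_stump(union), union

# Toy data: the true concept is x > 0.5, split into 4 disjoint subsets.
rng = random.Random(0)
data = [(x, int(x > 0.5)) for x in (rng.random() for _ in range(200))]
subsets = [data[i::4] for i in range(4)]

model, union = combine_by_selection(subsets)
accuracy = sum(predict(model, x) == y for x, y in data) / len(data)
print(f"kept {len(union)} of {len(data)} examples, accuracy {accuracy:.2f}")
```

In this toy setting the final stump is learned from at most four examples per subset, and it is a single model that can be inspected directly. A voting ensemble would instead keep all four local stumps and combine their predictions.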
Similar resources
DAGGER: A New Approach to Combining Multiple Models Learned from Disjoint Subsets
We introduce a new technique for combining multiple learned models. This technique results in a single comprehensible model. This is to be contrasted with current methods that typically combine models by voting. The core of the technique, the DAGGER (Disjoint Aggregation using Example Reduction) algorithm, selects examples which provide evidence for each decision region within each local model. ...
Distributed Learning on Very Large Data Sets
One approach to learning from intractably large data sets is to utilize all the training data by learning models on tractably sized subsets of the data. The subsets of data may be disjoint or partially overlapping. The individual learned models may be combined into a single model or a voting approach may be used to combine the classifications of a set of models. An approach to learning models in...
Online Streaming Feature Selection Using Geometric Series of the Adjacency Matrix of Features
Feature Selection (FS) is an important pre-processing step in machine learning and data mining. All the traditional feature selection methods assume that the entire feature space is available from the beginning. However, online streaming features (OSF) are an integral part of many real-world applications. In OSF, the number of training examples is fixed while the number of features grows with t...
A Learning-Based Algorithm Selection Meta-reasoner for the Real-Time MPE Problem
The algorithm selection problem aims to select the best algorithm for an input problem instance according to some characteristics of the instance. This paper presents a learning-based inductive approach to build a predictive algorithm selection system from empirical algorithm performance data of the Most Probable Explanation (MPE) problem. The learned model can serve as an algorithm selection me...
Ensemble of M5 Model Tree Based Modelling of Sodium Adsorption Ratio
This work reports the results of four ensemble approaches with the M5 model tree as the base regression model to anticipate Sodium Adsorption Ratio (SAR). Ensemble methods that combine the output of multiple regression models have been found to be more accurate than any of the individual models making up the ensemble. In this study additive boosting, bagging, rotation forest and random subspace...